Hypothesis Setting and Order Statistic for Robust Genomic Meta-analysis.

Authors

  • Chi Song
  • George C Tseng
Abstract

Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values (rth ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.
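As a rough illustration of the rOP statistic described in the abstract, the sketch below takes the rth smallest p-value across K studies and converts it to a combined p-value. It relies on a standard order-statistic fact: if the K p-values are independent and uniform on (0, 1) under the null, the rth smallest is at most x exactly when at least r of them are at most x, which is a binomial tail probability. The function name `rop_pvalue` and the example p-values are hypothetical, and the sketch does not cover the estimation of r or the one-sided correction mentioned in the abstract.

```python
from math import comb


def rop_pvalue(pvalues, r):
    """Combined p-value for the r-th ordered p-value (hypothetical helper).

    Assumes the K input p-values are independent and Uniform(0, 1) under
    the null, so that P(U_(r) <= x) = P(Binomial(K, x) >= r).
    """
    k = len(pvalues)
    if not 1 <= r <= k:
        raise ValueError("r must be between 1 and the number of studies")
    x = sorted(pvalues)[r - 1]  # r-th smallest p-value
    # Upper binomial tail: at least r of the K p-values fall at or below x.
    return sum(comb(k, j) * x**j * (1 - x) ** (k - j) for j in range(r, k + 1))


# Made-up p-values for one gene across five studies (illustration only).
study_pvalues = [0.001, 0.004, 0.02, 0.03, 0.6]
print(rop_pvalue(study_pvalues, r=4))  # r = 4 targets "DE in the majority of studies"
```

Under the same assumptions, r = 1 reduces to the minimum p-value method and r = K to the maximum p-value method. The binomial tail also suggests why the abstract relates rOP to vote counting: both count how many studies fall below a threshold.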

Similar resources

Robust Optimal Desirability Approach for Multiple Responses Optimization with Multiple Productions Scenarios

  An optimal desirability function method is proposed to optimize multiple responses in multiple production scenarios, simultaneously. In dynamic environments, changes in production requirements in each condition create different production scenarios. Therefore, in multiple production scenarios like producing in several production lines with different technologies in a factory, various fitted r...


LINEAR HYPOTHESIS TESTING USING DLR METRIC

Several practical hypothesis-testing problems can be examined under a general linear model such as analysis of variance. In analysis of variance, when the response random variable Y has a linear relationship with several random variables X, another important model, analysis of covariance, can be used. In this paper, assuming that Y is fuzzy and using the DLR metric, a method for testing ...


A meta-analysis of the effectiveness of psychological and exercise training and interventions on the quality of life of patients with type 2 diabetes (Iran: 1382-1392)

Background: Improving quality of life is one of the important indicators of diabetes treatment and control. In recent years, therapists and researchers have therefore paid growing attention to quality of life in these patients, and the number of studies in this field has increased. The aim of this study was to collect and integrate the results of these studies to investigate the effect size of sport an...


A randomization-based perspective of analysis of variance: a test statistic robust to treatment effect heterogeneity

Fisher randomization tests for Neyman’s null hypothesis of no average treatment effects are considered in a finite population setting associated with completely randomized experiments with more than two treatments. The consequences of using the F statistic to conduct such a test are examined both theoretically and computationally, and it is argued that under treatment effect heterogeneity, use ...


Consolidated Technique of Response Surface Methodology and Data Envelopment Analysis for setting the parameters of meta-heuristic algorithms - Case study: Production Scheduling Problem

  In this study, given sequence-dependent setup times, we attempt to use Response Surface Methodology (RSM) to set the parameters of the genetic algorithm (GA) used to optimize the problem of scheduling n jobs on 1 machine (n/1). The aim is to find the most suitable parameters for increasing the efficiency of the proposed algorithm. At first, a central composite d...


Journal title:
  • The Annals of Applied Statistics

Volume 8, Issue 2

Pages: -

Publication date: 2014